All Questions
Tagged with scikit-learn and cross-validation
109 questions
1 vote
0 answers
102 views
Confused about use of random states for training models in scikit
I am new to ML and currently working on improving the accuracy of an MLPClassifier in scikit. My code looks like so ...
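A minimal sketch of the usual answer, assuming the question is about reproducibility (the toy data and parameter values below are placeholders, not the asker's code): fixing random_state makes MLPClassifier runs repeatable, but it is not a tuning knob for accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

# Synthetic data standing in for the asker's dataset.
X, y = make_classification(n_samples=200, random_state=0)

# random_state fixes weight initialisation and batch shuffling, so repeated
# runs give identical results; it does not by itself improve accuracy.
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=42)
print(cross_val_score(clf, X, y, cv=5).mean())
```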
3 votes
1 answer
636 views
Why does scikit's cross-validation return a negative R^2 for my strongly correlated data
I have exactly the following preprocessed data in a small Pandas dataframe: ...
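An illustrative sketch (synthetic data, not the asker's dataframe) of how this can happen: on a held-out fold, R² is unbounded below, so a model that predicts worse than the fold's mean scores negative even when features and target are correlated.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 1))
y = 2 * X.ravel() + rng.normal(scale=5.0, size=30)  # correlated but noisy

# R^2 on each held-out fold; with small, noisy samples individual
# folds can easily come out negative.
print(cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2"))
```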
0 votes
1 answer
37 views
Sklearn EstimatorCV vs GridSearchCV
sklearn has the following description for EstimatorCV estimators: https://scikit-learn.org/stable/glossary.html#term-cross-validation-estimator An estimator that has built-in cross-validation ...
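A sketch of the contrast the glossary draws, using LogisticRegressionCV as an example EstimatorCV: the built-in variant can reuse computation along its regularisation path, while GridSearchCV refits a fresh estimator for every candidate value.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, LogisticRegressionCV
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=300, random_state=0)

# Built-in CV: searches a path of C values efficiently.
lr_cv = LogisticRegressionCV(Cs=10, cv=5).fit(X, y)

# Generic CV: refits from scratch for every candidate C.
grid = GridSearchCV(LogisticRegression(), {"C": [0.01, 0.1, 1, 10]}, cv=5).fit(X, y)
print(lr_cv.C_, grid.best_params_)
```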
2 votes
1 answer
31 views
Scoring function in cross-validation often left default
I'm a PhD student applying ML in microbiology. In research papers, the usual performance measure reported on classification models is ROC-AUC. But when I look at implementations, the scoring function ...
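A small sketch of the gap the question describes: cross_val_score falls back to the estimator's default .score() (accuracy for classifiers) unless a scorer is passed, so reporting ROC-AUC requires an explicit scoring="roc_auc".

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Imbalanced toy data, where accuracy and ROC-AUC diverge noticeably.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)
clf = LogisticRegression(max_iter=1000)

print(cross_val_score(clf, X, y, cv=5).mean())                     # accuracy
print(cross_val_score(clf, X, y, cv=5, scoring="roc_auc").mean())  # ROC-AUC
```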
1 vote
1 answer
91 views
How do I identify overfitting when using GridSearchCV?
For context, I'm using scikit-learn's GridSearchCV to find the best hyperparameters of a decision tree. I believe I understand train, validation, and test sets and overfitting concepts when applied ...
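One common diagnostic, sketched here on toy data: ask GridSearchCV for training scores and compare them against the validation scores in cv_results_; a large train/validation gap at the selected parameters is the usual overfitting signal.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    {"max_depth": [2, 5, 10, None]},
    cv=5,
    return_train_score=True,  # off by default
).fit(X, y)

# A mean_train_score far above mean_test_score flags overfitting.
res = pd.DataFrame(grid.cv_results_)
print(res[["param_max_depth", "mean_train_score", "mean_test_score"]])
```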
0 votes
0 answers
23 views
How to use cross validation to select/evaluate model with probability score as the output?
Initially I was evaluating my models using cross_val_score with out-of-the-box metrics such as precision, recall, F1 score, etc., or with my own metrics defined in ...
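A sketch of one standard approach, assuming the goal is to evaluate the probability scores themselves: proper scoring rules such as log loss or the Brier score plug straight into cross_val_score (sklearn negates losses so that "greater is better" holds uniformly).

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=300, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Both scorers use predict_proba, so they assess the probabilities directly.
print(cross_val_score(clf, X, y, cv=5, scoring="neg_log_loss").mean())
print(cross_val_score(clf, X, y, cv=5, scoring="neg_brier_score").mean())
```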
1 vote
1 answer
175 views
Integration of feature selection in a Pipeline
I have noticed that integrating feature selection in a pipeline alters results. Pipeline 1 gives slightly different results from pipeline 2. Why should this be so? Pipeline 2 ...
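A sketch of the usual explanation (the selector and classifier below are placeholders, not the asker's pipelines): when the selector is a pipeline step, it is refitted on each training fold, whereas selecting features once on the full data leaks test-fold information into every fold, so the two setups legitimately score differently.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=200, n_features=50, random_state=0)

# The selector sits inside the pipeline, so it only ever sees training folds.
pipe = Pipeline([
    ("select", SelectKBest(f_classif, k=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])
print(cross_val_score(pipe, X, y, cv=5).mean())
```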
0 votes
1 answer
90 views
Error when using KFold() and the roc_auc metric
Why does cross_val_score(pipe, X, y, scoring="roc_auc", cv=StratifiedKFold()) work just fine, while using KFold() like ...
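A sketch of the likely failure mode (synthetic data, not the asker's): with plain KFold on data whose labels are grouped, a test fold can contain a single class, and ROC-AUC is undefined in that case; StratifiedKFold preserves the class ratio in every fold, so it succeeds.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, StratifiedKFold, cross_val_score

X = np.random.default_rng(0).normal(size=(100, 5))
y = np.array([0] * 50 + [1] * 50)  # sorted labels: plain KFold folds are one-class

clf = LogisticRegression()
print(cross_val_score(clf, X, y, scoring="roc_auc", cv=StratifiedKFold()))  # works
print(cross_val_score(clf, X, y, scoring="roc_auc", cv=KFold()))  # NaNs + warnings
```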
0 votes
1 answer
629 views
Tuned model has higher CV accuracy, but a lower test accuracy. Should I use the tuned or untuned model?
I am working on a classification problem using scikit-learn and am confused about how to properly tune hyperparameters to get the "best" model. Before any tuning, my logistic regression ...
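A sketch of the methodology most answers recommend (toy data and grid are assumptions): select between models using cross-validation on the training portion only, and spend the test set on a single final estimate rather than on the model choice itself.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=400, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Tune on the training split only; the test split plays no role in selection.
grid = GridSearchCV(LogisticRegression(max_iter=1000),
                    {"C": [0.01, 0.1, 1, 10]}, cv=5).fit(X_tr, y_tr)
print(grid.best_score_)        # CV accuracy of the tuned model
print(grid.score(X_te, y_te))  # one final, untouched test estimate
```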
0 votes
1 answer
102 views
How do I test whether overfitting exists when I use the cross_val_score method?
I got the following code from a book on XGBoost. I wonder whether this is a correct way of analyzing the cross-validation score for overfitting purposes. The mean accuracy is 81%, which can be okay, but what if ...
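A sketch of how to get the missing signal, assuming the book's snippet used cross_val_score: cross_validate with return_train_score=True exposes the train/validation gap that a lone mean accuracy hides (XGBClassifier here is an assumption based on the book's topic).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from xgboost import XGBClassifier  # assumed model, matching the book

X, y = make_classification(n_samples=500, random_state=0)

# A training score far above the validation score suggests overfitting;
# similar scores suggest the mean accuracy is trustworthy.
res = cross_validate(XGBClassifier(), X, y, cv=5, return_train_score=True)
print(res["train_score"].mean(), res["test_score"].mean())
```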
1 vote
1 answer
853 views
Does sklearn perform feature selection within cross validation?
I would like to add a feature selector to my pipeline and use GridSearchCV to tune both the hyperparameters of the selector and the classifier(s). I am wondering whether sklearn performs feature selection ...
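A minimal sketch of the affirmative answer: with the selector as a pipeline step, GridSearchCV refits it inside every training fold and can tune its hyperparameters (here the selector's k, addressed via the step-name prefix) alongside the classifier's.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=30, random_state=0)
pipe = Pipeline([("select", SelectKBest()), ("clf", SVC())])

# "step__param" syntax tunes selector and classifier jointly.
param_grid = {"select__k": [5, 10, 20], "clf__C": [0.1, 1, 10]}
grid = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
print(grid.best_params_)
```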
0 votes
1 answer
1k views
Is there any benefit to using cross validation from the XGBoost library over sklearn when tuning hyperparameters?
The XGBoost library has its own implementation of cross-validation through xgboost.cv(). It looks like it requires data to be stored as a DMatrix. Instead of using ...
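A sketch of the library-native route for comparison (parameters are illustrative): xgboost.cv consumes a DMatrix and applies early stopping per boosting round across all folds, which sklearn's generic CV utilities do not do.

```python
import xgboost as xgb
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, random_state=0)
dtrain = xgb.DMatrix(X, label=y)  # the DMatrix the question mentions

# Early stopping chooses the number of boosting rounds from the fold AUCs.
results = xgb.cv(
    {"objective": "binary:logistic", "eta": 0.1},
    dtrain,
    num_boost_round=200,
    nfold=5,
    early_stopping_rounds=10,
    metrics="auc",
)
print(results.tail(1))  # per-round train/test AUC as a DataFrame
```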
0 votes
1 answer
87 views
Confusion regarding K-fold Cross Validation
In k-fold cross-validation, we divide the dataset into k folds, train the model on k-1 folds, and test it on the remaining fold. We do so until every fold has been assigned as the test ...
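A tiny sketch that makes the rotation concrete: over k iterations every fold serves as the test set exactly once, so each sample is held out exactly once.

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(10).reshape(-1, 1)  # ten samples, five folds of two
for i, (train_idx, test_idx) in enumerate(KFold(n_splits=5).split(X)):
    print(f"fold {i}: train={train_idx}, test={test_idx}")
```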
0 votes
1 answer
241 views
Can I use scikit-learn's cross_val_predict with cross_validate?
I am looking to make a visualization of my cross-validation data in which I can visualize the predictions that occurred within the cross-validation process. I am using scikit-learn's cross_validate to ...
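A sketch of one way to combine the two, assuming a fixed, seeded splitter: passing the same cv object to cross_validate and cross_val_predict makes the per-sample out-of-fold predictions line up with the fold-level scores, ready for plotting.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_predict, cross_validate

X, y = make_classification(n_samples=300, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)  # fixed splits
clf = LogisticRegression(max_iter=1000)

scores = cross_validate(clf, X, y, cv=cv)    # fold-level scores
preds = cross_val_predict(clf, X, y, cv=cv)  # one out-of-fold prediction per sample
print(scores["test_score"].mean(), (preds == y).mean())
```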
4 votes
0 answers
92 views
Does a different ROC AUC between cross-validation and the test set indicate overfitting or another problem?
I am training a composite model (XGBoost, Linear Regression, and RandomForest) to predict the probability that a person is injured. These are the results of cross-validation with 5 folds. I can't see any problem ...
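A sketch of the comparison at issue (a single RandomForest stands in for the asker's composite model, on synthetic data): CV AUC on the training portion versus AUC on the held-out test set; a small gap is ordinary sampling variation, a large one suggests overfitting or a train/test distribution shift.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

X, y = make_classification(n_samples=600, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(random_state=0)
cv_auc = cross_val_score(clf, X_tr, y_tr, cv=5, scoring="roc_auc").mean()
test_auc = roc_auc_score(y_te, clf.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
print(cv_auc, test_auc)  # compare the two estimates
```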